R is a programming language and environment often used for statistical analysis and data visualization.
Download R software: R is free, open-source software that can be installed directly from the R Project website.
Download RStudio: RStudio IDE is available via Posit.
Create new R files: open R Script, R Markdown, or Quarto (or other) file types by clicking “File” > “New File” > “R Script,” “R Markdown,” or “Quarto Document”.
Code in R basics:
Objects are values saved for future use and can be created using the <- assignment operator, e.g., pi <- 3.14159 or name <- "Wharton AI and Analytics Initiative".
Data frames are tables similar to Excel spreadsheets, where each row represents an observation, each column represents a variable, and each cell represents a value.
Variables can take different types in R. A few common data types you might encounter in R are numeric, character, logical, and factor.
Functions are commands that run underlying code to complete specific tasks on your data, similar to formulas in Excel, e.g., mean(c(1, 2, 3, 4)). You can write your own functions or use pre-packaged ones from R packages.
Access pre-packaged R functions: additional R packages beyond base R for more useful functions are available via the Comprehensive R Archive Network, or CRAN.
Comments are notes written in code following a # symbol; R ignores them when the code runs.
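As a minimal sketch of the basics above, the following uses only base R. The object names and the toy data are made up for illustration and are not part of the workshop data.

```r
# Objects: save values for later with the <- assignment operator
approx_pi <- 3.14159
org_name <- "Wharton AI and Analytics Initiative"

# Common data types (class() reports the type of an object)
class(approx_pi)                 # "numeric"
class(org_name)                  # "character"
class(TRUE)                      # "logical"
class(factor(c("low", "high")))  # "factor"

# Data frames: rows are observations, columns are variables
# (hypothetical toy data)
toy_df <- data.frame(
  city = c("A", "B", "C"),
  population = c(100, 250, 75)
)

# Functions: use a built-in one ...
mean(c(1, 2, 3, 4))  # 2.5

# ... or write your own
double_it <- function(x) {
  x * 2
}
double_it(toy_df$population)  # 200 500 150
```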
A couple of useful examples (among many R capabilities!):
Real-time access to public, openly-accessible data with web scraping
Create instant, interactive web reports
Automatically generate and update PowerPoint decks
#load packages with comment on their uses
library(tidyverse) #data organization
library(rvest) #scraping
library(polite) #scraping
library(skimr) #EDA
library(psych) #EDA
library(ggpubr) #EDA
library(plotly) #EDA
library(officer) #Microsoft Office functions
library(MetBrewer) #palettes
library(rmarkdown) #Rmd formatting
library(knitr) #Rmd formatting
# save a color palette
MetBrew_Cross <- MetBrewer::met.brewer("Cross", n = 50)
## install packages by running install.packages("PACKAGE-NAME")
## cite packages by running citation(package = "PACKAGE-NAME")
## load packages by running library(PACKAGE-NAME)
## specify function from a specific package by running PACKAGE-NAME::function()
## troubleshoot functions from packages by running ?PACKAGE-NAME::function()
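One way to put these package commands together is sketched below. The helper `missing_packages()` is a hypothetical name introduced here for illustration; it only reports which packages are not installed yet, and the actual installation line is left commented out so nothing is installed by accident.

```r
# Hypothetical helper: report which of the requested packages are not yet installed
# (requireNamespace() returns TRUE/FALSE and loads, but does not attach, the package)
missing_packages <- function(pkgs) {
  installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  pkgs[!installed]
}

# "stats" and "utils" ship with R, so nothing should be missing here
missing_packages(c("stats", "utils"))

# Install only what is missing
# to_install <- missing_packages(c("tidyverse", "rvest", "polite"))
# if (length(to_install) > 0) install.packages(to_install)

# Call a function from a specific package without attaching the package
stats::median(c(1, 5, 9))  # 5

# Get the citation for a package
citation(package = "stats")
```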
Why it matters: With R, you can continuously monitor and utilize publicly-available, openly-accessible data with minimal manual effort, enabling rapid strategic responses to help keep your organization agile and informed.
How it works: By using the R polite and rvest packages, you can easily introduce yourself to the host website, access its rules, and scrape public website data (as long as it is permitted by the host). This can be useful for easily pulling in public, business-relevant data for analyses.
In our example, we will look at GDP and population data for the 50 largest Metropolitan Statistical Areas (MSAs) in the United States from Wikipedia (Wikipedia Contributors, 2025), which is available at the following URL: https://en.wikipedia.org/wiki/List_of_United_States_metropolitan_areas_by_GDP under a CC-BY-SA 4.0 license.
Note: Some websites restrict scraping access (e.g., only allow it for certain uses) or prohibit it altogether. Before embarking on any web-scraping project, be sure to thoroughly review and adhere to all ethical (e.g., privacy protections) and legal obligations and comply with websites’ Terms of Use. Also, be mindful of your scraping requests so as to not overload websites’ servers.
#save URL and introduce to host
MSA_GDP_url <- "https://en.wikipedia.org/wiki/List_of_United_States_metropolitan_areas_by_GDP" #save URL for website of interest
MSA_GDP_url_bow <- polite::bow(MSA_GDP_url) #introduce user to host website
MSA_GDP_url_bow #check website robots.txt file for web scraping policies; also read Terms of Use on website
## <polite session> https://en.wikipedia.org/wiki/List_of_United_States_metropolitan_areas_by_GDP
## User-agent: polite R package
## robots.txt: 464 rules are defined for 34 bots
## Crawl delay: 5 sec
## The path is scrapable for this user-agent
#scrape table into dataframe
MSA_GDP_table <-
  polite::scrape(MSA_GDP_url_bow) %>% #scrape website from polite::bow object
  rvest::html_element("table.wikitable") %>% #choose item to be pulled from page
  rvest::html_table() %>% #convert html table to dataframe format
  base::as.data.frame() #update from tibble to dataframe format
knitr::kable(head(MSA_GDP_table, n = 10)) #see first 10 rows of raw dataframe
| rank | Metropolitan area | State(s) | 2023 | 2022 | 2021 | 2018 | 2017 | 2016 | 2015 | 2014 | 2013 | 2012 | Population (2020)[4] |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | New York-Newark-Jersey City, NY-NJ-PA (Metropolitan Statistical Area) | NY, NJ, PA | 2,298,868 | 2,163,209 | 1,992,779 | 1,772,319 | 1,698,122 | 1,634,671 | 1,577,366 | 1,511,763 | 1,439,043 | 1,401,233 | 20,140,470 |
| 2 | Los Angeles-Long Beach-Anaheim, CA (Metropolitan Statistical Area) | CA | 1,295,361 | 1,227,469 | 1,124,682 | 1,047,661 | 995,114 | 945,600 | 912,384 | 858,170 | 820,353 | 788,081 | 13,200,998 |
| 3 | Chicago-Naperville-Elgin, IL-IN-WI (Metropolitan Statistical Area) | IL, IN, WI | 894,862 | 832,900 | 764,583 | 689,464 | 659,855 | 641,589 | 627,033 | 599,805 | 577,948 | 561,016 | 9,618,502 |
| 4 | San Francisco-Oakland-Berkeley, CA (Metropolitan Statistical Area) | CA | 778,878 | 729,105 | 668,677 | 548,613 | 509,382 | 469,472 | 446,344 | 413,519 | 383,254 | 364,594 | 4,749,008 |
| 5 | Dallas-Fort Worth-Arlington, TX (Metropolitan Statistical Area) | TX | 744,654 | 688,928 | 598,333 | 512,509 | 482,218 | 458,973 | 442,879 | 420,929 | 394,178 | 375,065 | 7,637,387 |
| 6 | Washington-Arlington-Alexandria, DC-VA-MD-WV (Metropolitan Statistical Area) | DC, MD, VA, WV | 714,685 | 660,626 | 607,628 | 540,684 | 515,553 | 500,084 | 481,861 | 460,254 | 448,268 | 442,224 | 6,385,162 |
| 7 | Houston-The Woodlands-Sugar Land, TX (Metropolitan Statistical Area) | TX | 696,999 | 633,185 | 537,066 | 478,778 | 447,521 | 430,444 | 446,486 | 430,726 | 423,766 | 404,431 | 7,122,240 |
| 8 | Boston-Cambridge-Newton, MA-NH (Metropolitan Statistical Area) | MA, NH | 610,486 | 571,666 | 531,671 | 463,570 | 439,144 | 421,783 | 406,675 | 381,353 | 365,670 | 357,087 | 4,941,632 |
| 9 | Atlanta-Sandy Springs-Roswell, GA (Metropolitan Statistical Area) | GA | 570,663 | 525,888 | 473,823 | 397,261 | 385,542 | 369,806 | 347,604 | 326,502 | 307,750 | 291,481 | 6,089,815 |
| 10 | Seattle-Tacoma-Bellevue, WA (Metropolitan Statistical Area) | WA | 566,742 | 517,803 | 479,966 | 392,036 | 356,572 | 334,411 | 317,153 | 296,028 | 280,291 | 267,472 | 4,018,762 |
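To try the table-extraction step without sending any request to a website, the same rvest calls can be run on a small inline HTML snippet. This is a sketch under stated assumptions: the two-row table and its values are made up, and `polite` is deliberately left out because there is no host to introduce yourself to.

```r
library(rvest)

# Hypothetical two-row table standing in for a scraped page (no network needed)
toy_page <- rvest::minimal_html('
  <table class="wikitable">
    <tr><th>Metropolitan area</th><th>2023</th></tr>
    <tr><td>Alpha City, AA (Metropolitan Statistical Area)</td><td>1,234</td></tr>
    <tr><td>Beta Town, BB (Metropolitan Statistical Area)</td><td>567</td></tr>
  </table>
')

toy_table <- toy_page %>%
  rvest::html_element("table.wikitable") %>%  # first element matching the CSS selector
  rvest::html_table() %>%                     # parse the <table> into a tibble
  base::as.data.frame()                       # convert tibble to a plain data frame

# Values with thousands separators such as "1,234" typically come through as text,
# which is why a cleaning step is needed before analysis
toy_table
```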
#light data cleaning
MSA_GDP_table <- MSA_GDP_table %>%
  dplyr::rename("Rank" = "2023\nrank",
                "2020_Population" = "Population (2020)[4]",
                "MSA" = "Metropolitan area",
                "States" = "State(s)") %>% #rename some awkward column names
  dplyr::mutate(across(c(starts_with("20")), ~gsub(",", "", .))) %>% #remove commas from GDP and population numbers
  dplyr::mutate(across(MSA, ~str_trim(str_split_fixed(., ",", 2)[,1]))) %>% #keep only the text before the first comma, dropping state codes and the repeated "(Metropolitan Statistical Area)" label
  dplyr::mutate(across(c(4:14), as.numeric)) #change GDP and population columns (4 through 14) to numeric
#make sure data overview makes sense
utils::str(MSA_GDP_table) #data overview and types
## 'data.frame': 50 obs. of 14 variables:
## $ Rank : int 1 2 3 4 5 6 7 8 9 10 ...
## $ MSA : chr "New York-Newark-Jersey City" "Los Angeles-Long Beach-Anaheim" "Chicago-Naperville-Elgin" "San Francisco-Oakland-Berkeley" ...
## $ States : chr "NY, NJ, PA" "CA" "IL, IN, WI" "CA" ...
## $ 2023 : num 2298868 1295361 894862 778878 744654 ...
## $ 2022 : num 2163209 1227469 832900 729105 688928 ...
## $ 2021 : num 1992779 1124682 764583 668677 598333 ...
## $ 2018 : num 1772319 1047661 689464 548613 512509 ...
## $ 2017 : num 1698122 995114 659855 509382 482218 ...
## $ 2016 : num 1634671 945600 641589 469472 458973 ...
## $ 2015 : num 1577366 912384 627033 446344 442879 ...
## $ 2014 : num 1511763 858170 599805 413519 420929 ...
## $ 2013 : num 1439043 820353 577948 383254 394178 ...
## $ 2012 : num 1401233 788081 561016 364594 375065 ...
## $ 2020_Population: num 20140470 13200998 9618502 4749008 7637387 ...
base::nrow(MSA_GDP_table) #number of rows
## [1] 50
base::names(MSA_GDP_table) #column names
## [1] "Rank" "MSA" "States" "2023"
## [5] "2022" "2021" "2018" "2017"
## [9] "2016" "2015" "2014" "2013"
## [13] "2012" "2020_Population"
knitr::kable(head(MSA_GDP_table, n = 10)) #see first 10 rows of cleaned dataframe
| Rank | MSA | States | 2023 | 2022 | 2021 | 2018 | 2017 | 2016 | 2015 | 2014 | 2013 | 2012 | 2020_Population |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | New York-Newark-Jersey City | NY, NJ, PA | 2298868 | 2163209 | 1992779 | 1772319 | 1698122 | 1634671 | 1577366 | 1511763 | 1439043 | 1401233 | 20140470 |
| 2 | Los Angeles-Long Beach-Anaheim | CA | 1295361 | 1227469 | 1124682 | 1047661 | 995114 | 945600 | 912384 | 858170 | 820353 | 788081 | 13200998 |
| 3 | Chicago-Naperville-Elgin | IL, IN, WI | 894862 | 832900 | 764583 | 689464 | 659855 | 641589 | 627033 | 599805 | 577948 | 561016 | 9618502 |
| 4 | San Francisco-Oakland-Berkeley | CA | 778878 | 729105 | 668677 | 548613 | 509382 | 469472 | 446344 | 413519 | 383254 | 364594 | 4749008 |
| 5 | Dallas-Fort Worth-Arlington | TX | 744654 | 688928 | 598333 | 512509 | 482218 | 458973 | 442879 | 420929 | 394178 | 375065 | 7637387 |
| 6 | Washington-Arlington-Alexandria | DC, MD, VA, WV | 714685 | 660626 | 607628 | 540684 | 515553 | 500084 | 481861 | 460254 | 448268 | 442224 | 6385162 |
| 7 | Houston-The Woodlands-Sugar Land | TX | 696999 | 633185 | 537066 | 478778 | 447521 | 430444 | 446486 | 430726 | 423766 | 404431 | 7122240 |
| 8 | Boston-Cambridge-Newton | MA, NH | 610486 | 571666 | 531671 | 463570 | 439144 | 421783 | 406675 | 381353 | 365670 | 357087 | 4941632 |
| 9 | Atlanta-Sandy Springs-Roswell | GA | 570663 | 525888 | 473823 | 397261 | 385542 | 369806 | 347604 | 326502 | 307750 | 291481 | 6089815 |
| 10 | Seattle-Tacoma-Bellevue | WA | 566742 | 517803 | 479966 | 392036 | 356572 | 334411 | 317153 | 296028 | 280291 | 267472 | 4018762 |
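The two string-cleaning steps can be checked in isolation on a few values copied from the raw table. The sketch below uses base R equivalents (`gsub()`, `sub()`, `trimws()`) rather than the stringr calls in the pipeline, so it runs without any packages; the vector names are made up for illustration.

```r
# A few raw values shaped like the scraped table
raw_gdp <- c("2,298,868", "1,295,361", "894,862")
raw_msa <- c(
  "New York-Newark-Jersey City, NY-NJ-PA (Metropolitan Statistical Area)",
  "Los Angeles-Long Beach-Anaheim, CA (Metropolitan Statistical Area)"
)

# Strip thousands separators, then convert text to numbers
clean_gdp <- as.numeric(gsub(",", "", raw_gdp))
clean_gdp  # 2298868 1295361 894862

# Keep only the text before the first comma and trim stray whitespace
clean_msa <- trimws(sub(",.*$", "", raw_msa))
clean_msa  # "New York-Newark-Jersey City" "Los Angeles-Long Beach-Anaheim"
```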
Why it matters: Using R Markdown or Quarto, you can rapidly generate sophisticated, interactive reports that impress stakeholders, investors, and internal teams, without any additional resources. By integrating plots and data directly into reports from R, you can also minimize the risk of error, easily update reports, and facilitate reproducibility.
How it works: We will discuss a few concepts that are helpful for creating and customizing your reports; much more information is available in R Markdown: The Definitive Guide (Xie, Allaire, & Grolemund, 2023) and the Quarto documents guide (from Posit).
YAML metadata: Sits at the very top of an R Markdown or Quarto document and specifies cross-document information (e.g., title), settings (e.g., table of contents), and customization (e.g., visual theme).
Markdown text: Explanatory text written outside of code chunks in R Markdown or Quarto reports can be easily added and customized.
Code: R code chunks written in-line in reports are opened with a {r} chunk header; they can be labeled and can have chunk-specific behavior specified within the curly brackets.
Create report: Render your report by clicking the “Knit” button (“Render” for Quarto documents), and customize the output settings using the drop-down next to it.
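The pieces above (YAML metadata, Markdown text, and a labeled code chunk) can be seen together in a minimal R Markdown file. The sketch below writes such a file to a temporary folder from R; the title, theme, chunk label, and file name are all placeholders chosen for illustration, and the render call is left commented out because it needs the rmarkdown package and pandoc.

```r
# Build the chunk fence programmatically so this example can itself sit in a code block
fence <- strrep("`", 3)

rmd_lines <- c(
  "---",                                            # YAML metadata starts
  "title: \"Example report\"",                      # hypothetical title
  "output:",
  "  html_document:",
  "    toc: true",                                  # table of contents
  "    theme: flatly",                              # visual theme
  "---",                                            # YAML metadata ends
  "",
  "## Overview",
  "",
  "Markdown text goes here, outside of code chunks.",
  "",
  paste0(fence, "{r example-chunk, echo=FALSE}"),   # labeled chunk with a chunk option
  "summary(cars)",
  fence
)

rmd_path <- file.path(tempdir(), "example_report.Rmd")
writeLines(rmd_lines, rmd_path)

# Equivalent to clicking "Knit" in RStudio
# rmarkdown::render(rmd_path)
```

Opening the resulting file in RStudio shows the same three parts in the order described above.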
| skim_type | skim_variable | MSA | n_missing | complete_rate | numeric.mean | numeric.sd | numeric.p0 | numeric.p25 | numeric.p50 | numeric.p75 | numeric.p100 | numeric.hist |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| numeric | GDP | Omaha | 1 | 0.9 | 70592.44 | 11613.853 | 58032 | 63233.00 | 65867.0 | 76991.00 | 92356 | ▇▇▂▂▂ |
| numeric | GDP | Louisville/Jefferson County | 0 | 1.0 | 71615.60 | 14546.578 | 54813 | 61473.50 | 68312.5 | 80172.75 | 97751 | ▇▇▂▂▅ |
| numeric | GDP | Memphis | 0 | 1.0 | 77402.40 | 13341.510 | 65186 | 67512.00 | 72188.5 | 84057.00 | 102934 | ▇▃▂▁▃ |
| numeric | GDP | Oklahoma City | 0 | 1.0 | 78089.40 | 12199.421 | 63356 | 69761.50 | 73568.5 | 85250.50 | 100054 | ▇▇▂▂▅ |
| numeric | GDP | New Orleans-Metairie | 0 | 1.0 | 81884.40 | 9164.938 | 72811 | 76816.00 | 78555.0 | 81443.50 | 102437 | ▇▅▁▂▂ |
| numeric | GDP | Richmond | 0 | 1.0 | 84443.70 | 15276.246 | 67572 | 73336.50 | 81123.5 | 91659.25 | 116960 | ▇▆▂▂▂ |
| numeric | GDP | Raleigh | 0 | 1.0 | 85059.00 | 26192.653 | 57615 | 66562.25 | 75973.5 | 102132.25 | 133081 | ▇▆▁▂▃ |
| numeric | GDP | Jacksonville | 0 | 1.0 | 85168.30 | 23110.713 | 61519 | 68203.25 | 77688.5 | 96821.75 | 129095 | ▇▆▂▁▃ |
| numeric | GDP | Providence-Warwick | 0 | 1.0 | 86243.80 | 14159.734 | 71042 | 75590.50 | 82093.0 | 94538.25 | 111840 | ▇▃▂▂▃ |
| numeric | GDP | Bridgeport-Stamford-Norwalk | 0 | 1.0 | 91152.70 | 11555.168 | 81498 | 83076.75 | 86164.0 | 96410.00 | 116031 | ▇▁▁▁▁ |
| numeric | GDP | Milwaukee-Waukesha-West Allis | 0 | 1.0 | 96605.60 | 15613.576 | 87394 | 88642.75 | 89083.0 | 92418.00 | 130857 | ▇▁▁▁▁ |
| numeric | GDP | Salt Lake City | 0 | 1.0 | 96624.70 | 27562.002 | 70114 | 76571.50 | 86455.5 | 112447.00 | 147519 | ▇▃▁▂▃ |
| numeric | GDP | Virginia Beach-Norfolk-Newport News | 0 | 1.0 | 100007.60 | 13943.982 | 85485 | 90247.25 | 95588.0 | 105544.25 | 127459 | ▇▆▂▂▂ |
| numeric | GDP | Hartford-East Hartford-Middletown | 0 | 1.0 | 101149.50 | 10352.398 | 92699 | 93803.25 | 96980.0 | 104746.50 | 122805 | ▇▁▁▁▁ |
| numeric | GDP | Las Vegas-Henderson-Paradise | 0 | 1.0 | 119247.70 | 30657.011 | 87098 | 96240.00 | 109870.0 | 132754.25 | 178388 | ▇▆▂▁▃ |
| numeric | GDP | San Antonio-New Braunfels | 0 | 1.0 | 128905.70 | 27698.475 | 93330 | 111489.75 | 122914.5 | 141696.25 | 182139 | ▇▇▅▂▂ |
| numeric | GDP | Cleveland-Elyria | 0 | 1.0 | 131085.30 | 18919.419 | 106751 | 119715.25 | 126920.0 | 137287.25 | 173134 | ▇▇▅▂▂ |
| numeric | GDP | Columbus | 0 | 1.0 | 131534.30 | 27448.808 | 104000 | 111965.50 | 121719.0 | 148213.75 | 182087 | ▇▃▁▂▃ |
| numeric | GDP | Kansas City | 0 | 1.0 | 134950.90 | 26269.478 | 107028 | 117520.00 | 125621.0 | 148921.75 | 185746 | ▇▆▁▃▂ |
| numeric | GDP | Nashville-Davidson—Murfreesboro—Franklin | 0 | 1.0 | 135195.60 | 37423.427 | 96274 | 109484.50 | 122272.5 | 155323.75 | 204861 | ▇▆▁▂▃ |
| numeric | GDP | Sacramento—Roseville—Arden-Arcade | 0 | 1.0 | 139047.10 | 29141.410 | 101242 | 118803.50 | 133913.5 | 156776.25 | 189624 | ▇▅▅▂▅ |
| numeric | GDP | Indianapolis-Carmel-Anderson | 0 | 1.0 | 142838.00 | 29480.594 | 111780 | 124158.25 | 132066.5 | 156737.00 | 199198 | ▇▆▂▁▃ |
| numeric | GDP | Orlando-Kissimmee-Sanford | 0 | 1.0 | 143533.10 | 37361.047 | 103535 | 117245.25 | 130224.0 | 160196.00 | 217038 | ▇▆▂▁▃ |
| numeric | GDP | Cincinnati | 0 | 1.0 | 145420.70 | 29427.984 | 115420 | 124413.75 | 134811.0 | 164063.25 | 198889 | ▇▆▁▂▃ |
| numeric | GDP | Pittsburgh | 0 | 1.0 | 150152.50 | 23973.371 | 122439 | 134439.25 | 142655.5 | 164225.75 | 194230 | ▇▃▂▂▃ |
| numeric | GDP | Austin-Round Rock | 0 | 1.0 | 153263.30 | 50125.403 | 103513 | 120201.75 | 132508.0 | 182025.75 | 248110 | ▇▃▁▂▃ |
| numeric | GDP | Portland-Vancouver-Hillsboro | 0 | 1.0 | 156146.80 | 36875.343 | 115405 | 124085.00 | 150555.0 | 181032.25 | 218894 | ▇▃▂▂▃ |
| numeric | GDP | Tampa-St. Petersburg-Clearwater | 0 | 1.0 | 161375.10 | 42584.891 | 117711 | 129864.50 | 147205.0 | 182781.50 | 243268 | ▇▆▂▁▃ |
| numeric | GDP | St. Louis | 0 | 1.0 | 170745.40 | 28370.647 | 142184 | 151844.50 | 159498.5 | 183136.50 | 226549 | ▇▃▂▁▃ |
| numeric | GDP | Charlotte-Concord-Gastonia | 0 | 1.0 | 173845.60 | 42388.148 | 131934 | 144594.25 | 158086.5 | 198365.00 | 255666 | ▇▃▁▃▂ |
| numeric | GDP | Riverside-San Bernardino-Ontario | 0 | 1.0 | 182304.40 | 41710.580 | 132596 | 151960.75 | 172618.5 | 206664.50 | 256859 | ▇▇▂▂▅ |
| numeric | GDP | Baltimore-Columbia-Towson | 0 | 1.0 | 199041.70 | 32803.477 | 162776 | 174528.50 | 189517.0 | 218553.50 | 259690 | ▇▃▂▂▃ |
| numeric | GDP | Denver-Aurora-Lakewood | 0 | 1.0 | 219351.90 | 49242.626 | 167491 | 187883.50 | 202968.0 | 243588.50 | 311876 | ▇▆▂▁▃ |
| numeric | GDP | San Diego-Carlsbad | 0 | 1.0 | 238261.50 | 42433.472 | 189074 | 206638.00 | 227404.0 | 262264.25 | 314943 | ▇▃▂▂▃ |
| numeric | GDP | Detroit-Warren-Dearborn | 0 | 1.0 | 260671.30 | 37503.591 | 215614 | 232585.00 | 255521.0 | 279677.75 | 331333 | ▇▇▅▂▂ |
| numeric | GDP | Phoenix-Mesa-Scottsdale | 0 | 1.0 | 263214.60 | 71082.514 | 195630 | 212589.50 | 236981.0 | 300870.25 | 398129 | ▇▃▂▁▃ |
| numeric | GDP | Minneapolis-St. Paul-Bloomington | 0 | 1.0 | 265508.60 | 45004.358 | 213855 | 234825.25 | 255241.0 | 288649.25 | 350710 | ▆▇▁▂▃ |
| numeric | GDP | San Jose-Sunnyvale-Santa Clara | 0 | 1.0 | 292019.50 | 93304.442 | 180996 | 217475.50 | 264596.5 | 385389.75 | 422817 | ▇▇▁▂▇ |
| numeric | GDP | Miami-Fort Lauderdale-West Palm Beach | 0 | 1.0 | 360767.50 | 90404.744 | 266563 | 298137.00 | 337833.0 | 401545.25 | 533674 | ▇▆▂▁▃ |
| numeric | GDP | Seattle-Tacoma-Bellevue | 0 | 1.0 | 380847.40 | 105528.213 | 267472 | 301309.25 | 345491.5 | 457983.50 | 566742 | ▇▃▂▂▃ |
| numeric | GDP | Atlanta-Sandy Springs-Roswell | 0 | 1.0 | 399632.00 | 94271.520 | 291481 | 331777.50 | 377674.0 | 454682.50 | 570663 | ▆▇▁▂▃ |
| numeric | GDP | Philadelphia-Camden-Wilmington | 0 | 1.0 | 437052.80 | 63231.474 | 364052 | 393117.00 | 419324.5 | 469222.00 | 557601 | ▇▇▅▂▂ |
| numeric | GDP | Boston-Cambridge-Newton | 0 | 1.0 | 454910.50 | 88455.533 | 357087 | 387683.50 | 430463.5 | 514645.75 | 610486 | ▇▃▂▂▃ |
| numeric | GDP | Houston-The Woodlands-Sugar Land | 0 | 1.0 | 492940.20 | 98923.359 | 404431 | 430514.50 | 447003.5 | 522494.00 | 696999 | ▇▁▁▁▁ |
| numeric | GDP | Dallas-Fort Worth-Arlington | 0 | 1.0 | 511866.60 | 125737.185 | 375065 | 426416.50 | 470595.5 | 576877.00 | 744654 | ▇▆▁▂▃ |
| numeric | GDP | San Francisco-Oakland-Berkeley | 0 | 1.0 | 531183.80 | 147080.939 | 364594 | 421725.25 | 489427.0 | 638661.00 | 778878 | ▇▃▂▂▃ |
| numeric | GDP | Washington-Arlington-Alexandria | 0 | 1.0 | 537186.70 | 93965.576 | 442224 | 465655.75 | 507818.5 | 590892.00 | 714685 | ▇▆▁▂▃ |
| numeric | GDP | Chicago-Naperville-Elgin | 0 | 1.0 | 684905.50 | 111672.672 | 561016 | 606612.00 | 650722.0 | 745803.25 | 894862 | ▇▆▁▂▃ |
| numeric | GDP | Los Angeles-Long Beach-Anaheim | 0 | 1.0 | 1001487.50 | 171507.552 | 788081 | 871723.50 | 970357.0 | 1105426.75 | 1295361 | ▇▅▅▂▅ |
| numeric | GDP | New York-Newark-Jersey City | 0 | 1.0 | 1748937.30 | 307735.527 | 1401233 | 1528163.75 | 1666396.5 | 1937664.00 | 2298868 | ▇▃▂▂▃ |
| | vars | n | mean | sd | median | trimmed | mad | min | max | range | skew | kurtosis | se |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| X1 | 1 | 50 | 3650713 | 3389683 | 2455120 | 2978284 | 1534909 | 957419 | 20140470 | 19183051 | 2.841426 | 9.811452 | 479373.6 |
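The code that produced the summary tables above is not shown in this section. As a rough sketch of how per-MSA summaries like these can be built, the example below reshapes a wide table to long format and computes a mean and standard deviation per MSA. It uses `dplyr::summarise()` as a stand-in for the skimr output, and `toy_wide`, `toy_long`, and `toy_summary` are hypothetical names holding a three-year subset of two rows from the cleaned table.

```r
library(dplyr)
library(tidyr)

# Small wide table shaped like the cleaned MSA_GDP_table (subset of its values)
toy_wide <- data.frame(
  MSA = c("Seattle-Tacoma-Bellevue", "Boston-Cambridge-Newton"),
  `2023` = c(566742, 610486),
  `2022` = c(517803, 571666),
  `2021` = c(479966, 531671),
  check.names = FALSE  # keep the year column names as-is
)

# Wide to long: one row per MSA per year
toy_long <- toy_wide %>%
  tidyr::pivot_longer(cols = -MSA, names_to = "Year", values_to = "GDP") %>%
  dplyr::mutate(Year = as.integer(Year))

# Per-MSA summary, sorted ascending by mean GDP
toy_summary <- toy_long %>%
  dplyr::group_by(MSA) %>%
  dplyr::summarise(Mean = mean(GDP), SD = sd(GDP), .groups = "drop") %>%
  dplyr::arrange(Mean)

toy_summary

# With skimr installed, a grouped skim gives a fuller summary per MSA:
# toy_long %>% dplyr::group_by(MSA) %>% skimr::skim(GDP)
```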
Why it matters: Similar to creating reports in R, creating slide decks with R code saves manual effort, enhances reproducibility, and reduces the risk of errors, inaccuracies, or outdated information in key presentations.
How it works: The officer R package enables easy creation of slides integrating visualizations and tables directly from R code.
Note: You can also create PowerPoint decks by setting up Quarto or R Markdown files with a PowerPoint output directly (stay tuned for more information in another WAIAI tutorial)!
#open powerpoint deck
deck <- officer::read_pptx(path = "./template.pptx") %>% officer::remove_slide(index = 1) #create new powerpoint using template saved in this folder and remove blank first slide
deck <- officer::layout_default(deck, "Title and Content") #update default slide type
#add title slide
deck <- officer::add_slide(deck, layout = "Title Slide") #add new slide to open deck with title layout
deck <- officer::ph_with(deck, "U.S. Metropolitan Statistical Area GDP over time", location = ph_location_type(type = "ctrTitle")) #add and center title text
deck <- officer::ph_with(deck, "By Ginny Ulichney", location = ph_location_type(type = "subTitle")) #add and center subtitle text
#clean up table to be added to slide, create top 10 table
GDP_top10_table <- GDP_table %>%
  dplyr::slice_max(numeric.mean, n = 10) %>% #choose top 10 MSAs by GDP from ascending table
  dplyr::select(c(MSA, numeric.mean, numeric.sd)) %>% #select only columns we want in pptx
  dplyr::rename("Mean" = "numeric.mean",
                "SD" = "numeric.sd") #rename columns to look nicer in table
#add table slide
deck <- add_slide(deck) #add new slide for table
deck <- ph_with(deck, "Top 10 MSAs by mean GDP (millions of dollars) via Wikipedia, 2012-2023", location = ph_location_type(type = "title")) #add title to table slide
deck <- ph_with(deck, value = GDP_top10_table, location = ph_location_type(type = "body")) #add table to slide; updates made to table will automatically be reflected when re-compiling powerpoint
#add plot slide 1
deck <- officer::add_slide(deck)
deck <- officer::ph_with(deck, "GDP increased over 2012-2023 for most U.S. Metropolitan Statistical Areas (top 50)", location = ph_location_type(type = "title")) #add title to first plot slide
deck <- officer::ph_with(deck, value = lineplot, location = ph_location_type(type = "body")) #add line plot from above to slide; again, any updates made to plot will automatically be reflected
#add plot slide 2
deck <- officer::add_slide(deck)
deck <- officer::ph_with(deck, "2020 Population and 2021 GDP were positively associated across U.S. Metropolitan Statistical Areas (top 50)", location = ph_location_type(type = "title")) #add title to second plot slide
deck <- officer::ph_with(deck, value = scatterplot, location = ph_location_type(type = "body")) #add correlation scatterplot to slide
#save deck
print(deck, target = "./example_GDP_MSA_deck.pptx") #save finished deck to directory
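The deck code above depends on a local template.pptx and on objects created earlier in the workshop. For a version that runs on its own, the sketch below starts from officer's built-in default template instead, assuming its standard "Office Theme" master and layout names. The slide text, the toy table, and the output file name are placeholders.

```r
library(officer)

# Start from officer's built-in template instead of a custom template.pptx
toy_deck <- officer::read_pptx()

# Title slide
toy_deck <- officer::add_slide(toy_deck, layout = "Title Slide", master = "Office Theme")
toy_deck <- officer::ph_with(toy_deck, value = "Example deck",
                             location = officer::ph_location_type(type = "ctrTitle"))

# Table slide (hypothetical toy data)
toy_table <- data.frame(MSA = c("A", "B"), Mean = c(1.5, 2.5))
toy_deck <- officer::add_slide(toy_deck, layout = "Title and Content", master = "Office Theme")
toy_deck <- officer::ph_with(toy_deck, value = "Example table",
                             location = officer::ph_location_type(type = "title"))
toy_deck <- officer::ph_with(toy_deck, value = toy_table,
                             location = officer::ph_location_type(type = "body"))

# Save to a temporary folder and count the slides
deck_path <- file.path(tempdir(), "toy_deck.pptx")
print(toy_deck, target = deck_path)
length(toy_deck)  # number of slides in the deck
```

Re-running the script after changing `toy_table` regenerates the deck with the updated table, which is the same update-by-rerunning workflow the full example relies on.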
In this workshop, we discussed a few fundamental concepts in R and used R code to:
1: Scrape real-time, openly-accessible data
2: Create instant, interactive reports
3: Automatically generate PowerPoint decks for easy updating and reproducibility
In upcoming videos, we’ll discuss some more advanced things you can do in R, such as:
Gather insights from unstructured text data using text analysis
Create interactive plots to enhance clarity in complex data visualizations
Generate data dashboards (and slide decks!) directly in Quarto
R Core Team (2025). R: A Language and Environment for Statistical Computing. R Foundation for Statistical Computing, Vienna, Austria. https://www.R-project.org/.
Wikipedia contributors. (2025, June 5). List of United States metropolitan areas by GDP. In Wikipedia, The Free Encyclopedia. Retrieved 17:03, September 17, 2025, from https://en.wikipedia.org/w/index.php?title=List_of_United_States_metropolitan_areas_by_GDP&oldid=1294080196
Wickham H, Averick M, Bryan J, Chang W, McGowan LD, François R, Grolemund G, Hayes A, Henry L, Hester J, Kuhn M, Pedersen TL, Miller E, Bache SM, Müller K, Ooms J, Robinson D, Seidel DP, Spinu V, Takahashi K, Vaughan D, Wilke C, Woo K, Yutani H (2019). “Welcome to the tidyverse.” Journal of Open Source Software, 4(43), 1686. doi:10.21105/joss.01686 https://doi.org/10.21105/joss.01686.
Wickham H (2024). rvest: Easily Harvest (Scrape) Web Pages. doi:10.32614/CRAN.package.rvest https://doi.org/10.32614/CRAN.package.rvest, R package version 1.0.4, https://CRAN.R-project.org/package=rvest.
William Revelle (2025). psych: Procedures for Psychological, Psychometric, and Personality Research. Northwestern University, Evanston, Illinois. R package version 2.5.3, https://CRAN.R-project.org/package=psych.
Kassambara A (2023). ggpubr: ‘ggplot2’ Based Publication Ready Plots. doi:10.32614/CRAN.package.ggpubr https://doi.org/10.32614/CRAN.package.ggpubr, R package version 0.6.0, https://CRAN.R-project.org/package=ggpubr.
C. Sievert. Interactive Web-Based Data Visualization with R, plotly, and shiny. Chapman and Hall/CRC Florida, 2020.
Gohel D, Moog S, Heckmann M (2025). officer: Manipulation of Microsoft Word and PowerPoint Documents. doi:10.32614/CRAN.package.officer https://doi.org/10.32614/CRAN.package.officer, R package version 0.6.10, https://CRAN.R-project.org/package=officer.
Mills BR (2022). MetBrewer: Color Palettes Inspired by Works at the Metropolitan Museum of Art. doi:10.32614/CRAN.package.MetBrewer https://doi.org/10.32614/CRAN.package.MetBrewer, R package version 0.2.0, https://CRAN.R-project.org/package=MetBrewer.
Allaire J, Xie Y, Dervieux C, McPherson J, Luraschi J, Ushey K, Atkins A, Wickham H, Cheng J, Chang W, Iannone R (2024). rmarkdown: Dynamic Documents for R. R package version 2.29, https://github.com/rstudio/rmarkdown.
Xie Y, Allaire J, Grolemund G (2018). R Markdown: The Definitive Guide. Chapman and Hall/CRC, Boca Raton, Florida. ISBN 9781138359338, https://bookdown.org/yihui/rmarkdown.
Xie Y, Dervieux C, Riederer E (2020). R Markdown Cookbook. Chapman and Hall/CRC, Boca Raton, Florida. ISBN 9780367563837, https://bookdown.org/yihui/rmarkdown-cookbook.
Xie Y (2025). knitr: A General-Purpose Package for Dynamic Report Generation in R. R package version 1.50, https://yihui.org/knitr/.
Yihui Xie (2015) Dynamic Documents with R and knitr. 2nd edition. Chapman and Hall/CRC. ISBN 978-1498716963
Yihui Xie (2014) knitr: A Comprehensive Tool for Reproducible Research in R. In Victoria Stodden, Friedrich Leisch and Roger D. Peng, editors, Implementing Reproducible Computational Research. Chapman and Hall/CRC. ISBN 978-1466561595
Perepolkin D (2023). polite: Be Nice on the Web. doi:10.32614/CRAN.package.polite https://doi.org/10.32614/CRAN.package.polite, R package version 0.1.3, https://CRAN.R-project.org/package=polite
Posit team (2025). RStudio: Integrated Development Environment for R. Posit Software, PBC, Boston, MA. URL http://www.posit.co/.
## ─ Session info ───────────────────────────────────────────────────────────────
## setting value
## version R version 4.5.0 (2025-04-11)
## os macOS Sequoia 15.6
## system aarch64, darwin20
## ui X11
## language (EN)
## collate en_US.UTF-8
## ctype en_US.UTF-8
## tz America/New_York
## date 2025-12-17
## pandoc 3.4 @ /Applications/RStudio.app/Contents/Resources/app/quarto/bin/tools/aarch64/ (via rmarkdown)
## quarto 1.6.42 @ /Applications/RStudio.app/Contents/Resources/app/quarto/bin/quarto
##
## ─ Packages ───────────────────────────────────────────────────────────────────
## package * version date (UTC) lib source
## abind 1.4-8 2024-09-12 [1] CRAN (R 4.5.0)
## askpass 1.2.1 2024-10-04 [1] CRAN (R 4.5.0)
## assertthat 0.2.1 2019-03-21 [1] CRAN (R 4.5.0)
## backports 1.5.0 2024-05-23 [1] CRAN (R 4.5.0)
## base64enc 0.1-3 2015-07-28 [1] CRAN (R 4.5.0)
## broom 1.0.8 2025-03-28 [1] CRAN (R 4.5.0)
## bslib 0.9.0 2025-01-30 [1] CRAN (R 4.5.0)
## cachem 1.1.0 2024-05-16 [1] CRAN (R 4.5.0)
## car 3.1-3 2024-09-27 [1] CRAN (R 4.5.0)
## carData 3.0-5 2022-01-06 [1] CRAN (R 4.5.0)
## cli 3.6.5 2025-04-23 [1] CRAN (R 4.5.0)
## crosstalk 1.2.1 2023-11-23 [1] CRAN (R 4.5.0)
## curl 6.2.2 2025-03-24 [1] CRAN (R 4.5.0)
## data.table 1.17.2 2025-05-12 [1] CRAN (R 4.5.0)
## dichromat 2.0-0.1 2022-05-02 [1] CRAN (R 4.5.0)
## digest 0.6.37 2024-08-19 [1] CRAN (R 4.5.0)
## dplyr * 1.1.4 2023-11-17 [1] CRAN (R 4.5.0)
## evaluate 1.0.3 2025-01-10 [1] CRAN (R 4.5.0)
## farver 2.1.2 2024-05-13 [1] CRAN (R 4.5.0)
## fastmap 1.2.0 2024-05-15 [1] CRAN (R 4.5.0)
## forcats * 1.0.0 2023-01-29 [1] CRAN (R 4.5.0)
## Formula 1.2-5 2023-02-24 [1] CRAN (R 4.5.0)
## fs 1.6.6 2025-04-12 [1] CRAN (R 4.5.0)
## generics 0.1.4 2025-05-09 [1] CRAN (R 4.5.0)
## ggplot2 * 4.0.0 2025-09-11 [1] CRAN (R 4.5.0)
## ggpubr * 0.6.0 2023-02-10 [1] CRAN (R 4.5.0)
## ggsignif 0.6.4 2022-10-13 [1] CRAN (R 4.5.0)
## glue 1.8.0 2024-09-30 [1] CRAN (R 4.5.0)
## gtable 0.3.6 2024-10-25 [1] CRAN (R 4.5.0)
## hms 1.1.3 2023-03-21 [1] CRAN (R 4.5.0)
## htmltools 0.5.8.1 2024-04-04 [1] CRAN (R 4.5.0)
## htmlwidgets 1.6.4 2023-12-06 [1] CRAN (R 4.5.0)
## httr 1.4.7 2023-08-15 [1] CRAN (R 4.5.0)
## jquerylib 0.1.4 2021-04-26 [1] CRAN (R 4.5.0)
## jsonlite 2.0.0 2025-03-27 [1] CRAN (R 4.5.0)
## knitr * 1.50 2025-03-16 [1] CRAN (R 4.5.0)
## labeling 0.4.3 2023-08-29 [1] CRAN (R 4.5.0)
## lattice 0.22-6 2024-03-20 [1] CRAN (R 4.5.0)
## lazyeval 0.2.2 2019-03-15 [1] CRAN (R 4.5.0)
## lifecycle 1.0.4 2023-11-07 [1] CRAN (R 4.5.0)
## lubridate * 1.9.4 2024-12-08 [1] CRAN (R 4.5.0)
## magrittr 2.0.3 2022-03-30 [1] CRAN (R 4.5.0)
## Matrix 1.7-3 2025-03-11 [1] CRAN (R 4.5.0)
## memoise 2.0.1 2021-11-26 [1] CRAN (R 4.5.0)
## MetBrewer * 0.2.0 2022-03-21 [1] CRAN (R 4.5.0)
## mgcv 1.9-1 2023-12-21 [1] CRAN (R 4.5.0)
## mime 0.13 2025-03-17 [1] CRAN (R 4.5.0)
## mnormt 2.1.1 2022-09-26 [1] CRAN (R 4.5.0)
## nlme 3.1-168 2025-03-31 [1] CRAN (R 4.5.0)
## officer * 0.6.10 2025-05-30 [1] CRAN (R 4.5.0)
## openssl 2.3.2 2025-02-03 [1] CRAN (R 4.5.0)
## pillar 1.11.0 2025-07-04 [1] CRAN (R 4.5.0)
## pkgconfig 2.0.3 2019-09-22 [1] CRAN (R 4.5.0)
## plotly * 4.11.0 2025-06-19 [1] CRAN (R 4.5.0)
## polite * 0.1.3 2023-06-30 [1] CRAN (R 4.5.0)
## psych * 2.5.3 2025-03-21 [1] CRAN (R 4.5.0)
## purrr * 1.0.4 2025-02-05 [1] CRAN (R 4.5.0)
## R6 2.6.1 2025-02-15 [1] CRAN (R 4.5.0)
## ragg 1.4.0 2025-04-10 [1] CRAN (R 4.5.0)
## ratelimitr 0.4.1 2018-10-07 [1] CRAN (R 4.5.0)
## RColorBrewer 1.1-3 2022-04-03 [1] CRAN (R 4.5.0)
## Rcpp 1.0.14 2025-01-12 [1] CRAN (R 4.5.0)
## readr * 2.1.5 2024-01-10 [1] CRAN (R 4.5.0)
## repr 1.1.7 2024-03-22 [1] CRAN (R 4.5.0)
## rlang 1.1.6 2025-04-11 [1] CRAN (R 4.5.0)
## rmarkdown * 2.29 2024-11-04 [1] CRAN (R 4.5.0)
## robotstxt 0.7.15 2024-08-29 [1] CRAN (R 4.5.0)
## rstatix 0.7.2 2023-02-01 [1] CRAN (R 4.5.0)
## rstudioapi 0.17.1 2024-10-22 [1] CRAN (R 4.5.0)
## rvest * 1.0.4 2024-02-12 [1] CRAN (R 4.5.0)
## S7 0.2.0 2024-11-07 [1] CRAN (R 4.5.0)
## sass 0.4.10 2025-04-11 [1] CRAN (R 4.5.0)
## scales 1.4.0 2025-04-24 [1] CRAN (R 4.5.0)
## selectr 0.4-2 2019-11-20 [1] CRAN (R 4.5.0)
## sessioninfo 1.2.3 2025-02-05 [1] CRAN (R 4.5.0)
## skimr * 2.1.5 2022-12-23 [1] CRAN (R 4.5.0)
## spiderbar 0.2.5 2023-02-11 [1] CRAN (R 4.5.0)
## stringi 1.8.7 2025-03-27 [1] CRAN (R 4.5.0)
## stringr * 1.5.1 2023-11-14 [1] CRAN (R 4.5.0)
## systemfonts 1.2.3 2025-04-30 [1] CRAN (R 4.5.0)
## textshaping 1.0.1 2025-05-01 [1] CRAN (R 4.5.0)
## tibble * 3.3.0 2025-06-08 [1] CRAN (R 4.5.0)
## tidyr * 1.3.1 2024-01-24 [1] CRAN (R 4.5.0)
## tidyselect 1.2.1 2024-03-11 [1] CRAN (R 4.5.0)
## tidyverse * 2.0.0 2023-02-22 [1] CRAN (R 4.5.0)
## timechange 0.3.0 2024-01-18 [1] CRAN (R 4.5.0)
## tzdb 0.5.0 2025-03-15 [1] CRAN (R 4.5.0)
## usethis 3.1.0 2024-11-26 [1] CRAN (R 4.5.0)
## uuid 1.2-1 2024-07-29 [1] CRAN (R 4.5.0)
## vctrs 0.6.5 2023-12-01 [1] CRAN (R 4.5.0)
## viridisLite 0.4.2 2023-05-02 [1] CRAN (R 4.5.0)
## withr 3.0.2 2024-10-28 [1] CRAN (R 4.5.0)
## xfun 0.52 2025-04-02 [1] CRAN (R 4.5.0)
## xml2 1.3.8 2025-03-14 [1] CRAN (R 4.5.0)
## yaml 2.3.10 2024-07-26 [1] CRAN (R 4.5.0)
## zip 2.3.3 2025-05-13 [1] CRAN (R 4.5.0)
##
## [1] /Library/Frameworks/R.framework/Versions/4.5-arm64/Resources/library
## * ── Packages attached to the search path.
##
## ──────────────────────────────────────────────────────────────────────────────